PTMs and half-lives

Check against the literature that phosphorylation is the most abundant PTM.

Proteins with a short half-life

Proteins can have varying half-lives

  • What do the half-lives depend on?

  • How are they measured?

Below, the distribution of half-lives reported in the literature is compared with the distribution for the subset of those proteins that also occur in the dataset.

This checks which proteins fall within a particular half-life interval.

These are the modifications found for a particular protein.

I want to remove the proteins with very high values of log10(counts_norm_abund_len).

Outliers

Detecting the outliers:
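
The notes do not say which outlier rule was used. As a minimal sketch, assuming the common 1.5 × IQR (Tukey) rule applied on the log10 scale, detection and removal could look like the following. The data frame `proteins` and its values are made up for illustration; only the column name `counts_norm_abund_len` comes from the notes.

```r
# Hypothetical example data: 198 ordinary proteins plus two extreme ones.
set.seed(1)
proteins <- data.frame(
  protein = paste0("P", 1:200),
  counts_norm_abund_len = c(10^rnorm(198, mean = 0, sd = 0.3), 1e4, 5e4)
)

# Work on the log10 scale, as in the notes.
log_counts <- log10(proteins$counts_norm_abund_len)

# Assumed rule: flag values more than 1.5 * IQR beyond the quartiles.
q <- quantile(log_counts, probs = c(0.25, 0.75), names = FALSE)
fence <- 1.5 * (q[2] - q[1])
is_outlier <- log_counts < q[1] - fence | log_counts > q[2] + fence

outliers <- proteins[is_outlier, ]
proteins_clean <- proteins[!is_outlier, ]

# Check that no flagged protein remains in the cleaned data.
stopifnot(!any(proteins_clean$protein %in% outliers$protein))
```

A different rule (for example a fixed cutoff read off the histogram) would only change how `is_outlier` is computed.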

Check that the outliers are removed:

What is the resulting distribution?

SRRM2_HUMAN

2752 amino acids.

There is a huge peak when the data are normalised only by the number of raw files.

Looking at SRRM2_HUMAN in more detail.

What are the most common modifications in this protein?

PTMs

Using genes from GenAge is legitimate. Can continue doing that.

Prediction and characterization of human ageing-related proteins by using machine learning, Scientific Reports (nature.com)

PTMs of interest:

  • PTMs that control autophagy

    • phosphorylation

    • ubiquitination -> need to use the new dataset

    • acetylation

  • oxPTMs

    • you have a list of these

  • Methylation, e.g. of histones

  • Acylation -> need to get this from the paper.

Phosphorylation

The outliers have already been removed from this data.

  • Only the modification [21]Phospho is present here.

Splitting the dataset in a group with phosphorylation proteins and another group with all remaining proteins.
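
A minimal sketch of such a split, assuming a long table with one row per protein and modification. The data frame `mods` and the column names `protein`, `modification` and `half_life` are hypothetical; only the modification label `[21]Phospho` is taken from the notes.

```r
# Hypothetical long-format table: one row per (protein, modification).
mods <- data.frame(
  protein = c("A", "A", "B", "C", "C", "D"),
  modification = c("[21]Phospho", "[1]Acetyl", "[1]Acetyl",
                   "[21]Phospho", "[34]Methyl", "[34]Methyl"),
  half_life = c(12, 12, 30, 8, 8, 45)
)

# Proteins carrying at least one [21]Phospho modification.
phospho_ids <- unique(mods$protein[mods$modification == "[21]Phospho"])

# Collapse to one row per protein so each half-life is counted once.
per_protein <- unique(mods[, c("protein", "half_life")])
per_protein$group <- ifelse(per_protein$protein %in% phospho_ids,
                            "Phosphorylated", "Other")

# Half-lives of the two groups, ready for plotting or testing.
groups <- split(per_protein$half_life, per_protein$group)
```

Collapsing to one row per protein before grouping keeps heavily modified proteins from being over-represented in either distribution.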

It is not necessary to include another density line with all of the proteins. You can just compare the two distributions.

Comparison

Testing whether the half-lives between groups are significantly different
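
The notes do not name the test that was used. One reasonable option, shown here purely as an assumption, is the Wilcoxon rank-sum (Mann-Whitney) test, which does not require normally distributed half-lives. The two vectors are made-up values.

```r
# Hypothetical half-lives for the two groups.
hl_phospho <- c(5, 7, 9, 11, 13, 15, 17, 19)
hl_other   <- c(20, 24, 28, 32, 36, 40, 44, 48)

# Two-sided Wilcoxon rank-sum test (stats package, loaded by default).
res <- wilcox.test(hl_phospho, hl_other, alternative = "two.sided")

p_value <- res$p.value
significant <- p_value < 0.05
```

If several PTM groups are tested this way, the resulting p-values could be corrected for multiple testing, for example with `p.adjust(p_values, method = "BH")`.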

Enriched proteins in both datasets:

Proteins that are only present in one of the dataframes.
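
As a sketch, the proteins shared by two data frames and those unique to each can be found with base R set operations. The data frames `df_phospho` and `df_other` and their `protein` column are hypothetical names; SRRM2_HUMAN is the only identifier taken from the notes.

```r
# Hypothetical data frames holding protein identifiers.
df_phospho <- data.frame(protein = c("SRRM2_HUMAN", "P1", "P2", "P3"))
df_other   <- data.frame(protein = c("P2", "P3", "P4"))

# Proteins present in both data frames.
in_both <- intersect(df_phospho$protein, df_other$protein)

# Proteins present in exactly one of the two.
only_phospho <- setdiff(df_phospho$protein, df_other$protein)
only_other   <- setdiff(df_other$protein, df_phospho$protein)
```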

Acetylation

  • Filtered by the [1]Acetyl modification.

Ubiquitination

Ubiquitination has the classification ‘Other’. Take that as one group. The second group is all of the PTMs. 890 proteins overlap, so there are 289 proteins that are not ubiquitinated, carry other PTMs, and have known half-lives. These make up the second group.

Methylation

  • Filtered by the [34]Methyl modification

Violin plots

Enrichment:

oxPTMs

This is only for proteins that are related to ageing.

All PTMs related to oxidative damage in general, not only oxidation.

Lysine acylations

Violin plot

AGEs

Violin plots

Binning

Hypothesis: the longer the half-life, the greater the number of PTMs.

Phosphorylation

oxPTMs

methylation

Ubiquitination, acetylation, lysine acylations, AGEs

Check the number of proteins in each bin.

# A tibble: 5 × 2
  hl_group protein_count
  <chr>            <int>
1 0-5                236
2 10-15              258
3 15-20              183
4 20+                286
5 5-10               228
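
Because `hl_group` is stored as a character column, the bins in the output above sort alphabetically, so 5-10 comes last. A minimal sketch of binning with `cut()` into an ordered factor keeps the bins in numeric order. The half-life values are made up, and treating each bin as closed on the left (so a value of exactly 5 falls in 5-10) is an assumption.

```r
# Hypothetical half-lives.
half_life <- c(1.2, 4.9, 5, 7.5, 12, 15, 19.9, 20, 60, 300)

# Bin labels as they appear in the output above.
bin_labels <- c("0-5", "5-10", "10-15", "15-20", "20+")

# right = FALSE makes intervals [0,5), [5,10), ..., [20,Inf) (assumed).
hl_group <- cut(half_life,
                breaks = c(0, 5, 10, 15, 20, Inf),
                labels = bin_labels,
                right = FALSE,
                ordered_result = TRUE)

# Number of proteins per bin, listed in numeric bin order.
protein_count <- table(hl_group)
```

An ordered factor also makes plots and later summaries show the bins from shortest to longest half-life without manual reordering.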

oxPTMs

# A tibble: 15 × 3
# Groups:   hl_group [5]
   hl_group mod_group      protein_count
   <chr>    <chr>                  <int>
 1 0-5      -                        235
 2 0-5      Phosphorylated           166
 3 0-5      oxPTMs                   227
 4 10-15    -                        256
 5 10-15    Phosphorylated           207
 6 10-15    oxPTMs                   254
 7 15-20    -                        182
 8 15-20    Phosphorylated           147
 9 15-20    oxPTMs                   184
10 20+      -                        286
11 20+      Phosphorylated           236
12 20+      oxPTMs                   285
13 5-10     -                        223
14 5-10     Phosphorylated           162
15 5-10     oxPTMs                   221
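
A table of this shape can be sketched as a count of distinct proteins per half-life bin and modification group. The data frame `long` and its rows are hypothetical; the column names `hl_group`, `mod_group` and `protein_count` follow the output above. This is a base R illustration, not necessarily how the original table was produced.

```r
# Hypothetical long table: one row per (protein, modification) observation.
long <- data.frame(
  protein   = c("A", "A", "A", "B", "C"),
  hl_group  = factor(c("0-5", "0-5", "0-5", "0-5", "5-10"),
                     levels = c("0-5", "5-10")),
  mod_group = c("Phosphorylated", "oxPTMs", "oxPTMs", "oxPTMs",
                "Phosphorylated")
)

# Drop duplicate rows so each protein is counted once per bin and group.
distinct_rows <- unique(long[, c("protein", "hl_group", "mod_group")])

# Cross-tabulate and reshape to one row per (hl_group, mod_group).
counts <- as.data.frame(
  table(hl_group = distinct_rows$hl_group,
        mod_group = distinct_rows$mod_group),
  responseName = "protein_count"
)
```

Deduplicating first matters because a protein with several oxidative modifications would otherwise be counted more than once in the oxPTMs row of its bin.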

Proteins with a long half-life

Long-lived proteins can be used as estimators of chronological age. They can be defined in different ways, for example by comparing a protein's half-life with the average half-life of proteins in the organism. In this case, long-lived proteins were obtained from the following study: paper. That study classified proteins as long-lived based on their degree of degradation during the experiment, without a priori assumptions, which made it possible to discover new long-lived proteins.

The study identified its list of long-lived proteins in rats, so the human orthologs of these proteins were retrieved.

Plot the data distributions

Outliers

Checking that the outliers have been removed.

All of the outliers have been removed.

Check the distribution of the half-lives:

Remove the proteins with very large half-lives:
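
The cutoff for "very large" is not stated in the notes. As a sketch under that caveat, one option is to drop everything above an upper quantile. The 99th percentile used here is an arbitrary illustrative choice, and the data frame `hl` with its `half_life` column is hypothetical.

```r
# Hypothetical half-lives: 99 ordinary values and one extreme value.
hl <- data.frame(
  protein = paste0("P", 1:100),
  half_life = c(seq(1, 99), 5000)
)

# Assumed cutoff: the 99th percentile of the observed half-lives.
cutoff <- quantile(hl$half_life, probs = 0.99, names = FALSE)

# Keep only proteins at or below the cutoff.
hl_trimmed <- hl[hl$half_life <= cutoff, ]
```

A fixed threshold chosen from the distribution plot would work the same way, with `cutoff` set by hand.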

Now the exact same thing, but for `human_complete_hl_long`:

PTMs

Phosphorylation

Violin plot:

Acetylation

Comparison:

Ubiquitination

Comparison

Methylation

oxPTMs

Comparing

Lysine acylations

AGEs

Violin plots:

Binning

oxPTMs: